Evaluation and Sociolinguistic Analysis of Text Features for Gender and Age Identification
Authors
Corresponding author: Vasiliki Simaki, Centre for Languages and Literature, Lund University, Lund, Sweden. Email: [email protected]
Abstract
The paper presents an interdisciplinary study of automatic gender and age identification, informed by sociolinguistic knowledge of the gendered and age-related linguistic choices that social media users make. The authors gathered and investigated standard and novel text features used in text mining approaches to author demographic profiling, and examined their efficacy in gender and age detection tasks on a corpus consisting of social media texts. The most informative features are analyzed according to the nature of each feature, and the information derived from the features' importance scores is discussed.
Similar resources
Author gender identification from text using Bayesian Random Forest
Nowadays, the heavy use of virtual environments and the connections users form via social networks like Facebook, Instagram, and Twitter make it more necessary than ever to identify shared subjects in this environment. There are several applications that benefit from reliable methods for inferring age and gender of users in social media. Such applications exist across a wide area of fields,...
Age Identification of Twitter Users: Classification Methods and Sociolinguistic Analysis
In this article, we address the problem of identifying the age of Twitter users based on their online text. We used a set of text mining, sociolinguistic-based and content-related text features, and we evaluated a number of well-known and widely used machine learning classification algorithms in order to examine their suitability for this task. The experimental results showed that Random F...
Using Sociolinguistic Inspired Features for Gender Classification of Web Authors
In this article we present a methodology for classification of text from web authors, using sociolinguistic inspired text features. The proposed methodology uses a baseline text mining based feature set, which is combined with text features that quantify results from theoretical and sociolinguistic studies. Two combination approaches were evaluated and the evaluation results indicated a signifi...
A Sociolinguistic Scrutiny of the Great Gatsby and its Persian Translation in Light of Hatim and Mason’s Framework
Translation studies essentially deals with a socio-communicatively driven and contextualized enterprise. Viewed hence, it seems that no discipline tends to provide the possibility of studying the interrelations between interlocutors to generate meaning within the interactive social context as precisely as sociolinguistics (Federici, 2018). A sociolinguistic approach to translation seems to be i...
A Document Weighted Approach for Gender and Age Prediction Based on Term Weight Measure
Author profiling is a text classification technique used to predict the profiles of the authors of unknown texts by analyzing their writing styles. Author profiles are characteristics of the authors such as gender, age, native language, country and educational background. Existing approaches to author profiling suffer from problems like high dimensionality of features and fail to capture th...
Publication date: 2017